Datacenter workload evaluation

ABSTRACT

A method is provided for evaluating workload consolidation on a computer located in a datacenter. The method comprises inflating a balloon workload on a first computer that simulates a consolidation workload of a workload originating on the first computer and a workload originating on a second computer. The method further comprises evaluating the quality of service on the first computer's workload during the inflating and transferring the workload originating on either the first or the second computer to the other of the first or second computer if the quality of service remains above a threshold.

BACKGROUND

Datacenters with several servers or computers having variable workloads may wish to consolidate workloads by transferring a workload from one machine (the migrating machine) to a second machine (the destination machine) having a preexisting workload. The decision to consolidate the workloads onto the destination machine may be based upon any number of reasons, including, for example, a desire to save power, to relocate the workload to an area in the datacenter offering better cooling or ventilation, to move the workload from an underutilized machine to a more utilized machine, to reduce cost on leased hardware, or to reduce cost on licensed software.

When consolidating workloads onto a destination machine, it is difficult to predict the impact on the quality of service (QOS) on the computer or server receiving the additional workload. Current methods for determining workload transference simply “add up” the resources (e.g., CPU, memory, and IO) demanded by the workloads on the destination and migrating machines. Such an approach, however, does not account for conflicts that can prevent the new and existing workloads from working well together on a single machine. Interferences often arise at some level between the additional and existing workloads that cannot be accounted for by the current additive methods for evaluating workload transference. As such, the QOS is compromised and the workload is typically transferred back from the destination machine to the migrating machine, costing the datacenter both money and time.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 illustrates one example embodiment of a datacenter structured for workload evaluation.

FIG. 2 illustrates an example embodiment of a general purpose computer system.

FIG. 3 illustrates the example embodiment of FIG. 1 in which a workload in a migrating computer is evaluated for consolidating with a workload on a destination computer.

FIG. 4 illustrates an example embodiment of a datacenter structured for workload evaluation.

FIG. 5 illustrates a flow diagram of an embodiment employing workload evaluation management for workload consolidation on a destination computer.

FIG. 6 illustrates a flow diagram of an alternative embodiment employing workload evaluation management for workload consolidation on a destination computer.

DETAILED DESCRIPTION

With reference now to the figures, and in particular with reference to FIG. 1, there is depicted a datacenter 100 utilizing workload evaluation management through workload manager 105 between a plurality of computers 110-150. The workload manager 105 can be a stand-alone component or distributed among the plurality of computers 110-150 in the datacenter 100. The workload manager 105 employs a management program that automatically optimizes the datacenter's operations. The optimization by the workload manager 105 management program may seek, for example, to improve performance, reduce power consumption, reduce cooling problems, allow for maintenance, avoid failing hardware, or achieve any other goal set for the datacenter.

The workload evaluation management through the workload manager 105 program simulates a consolidation workload without actually moving a workload. The consolidated workload simulation occurs on a computer targeted for workload transfer (the migration computer) in order to evaluate whether the quality of service (QOS) (e.g., utilization of memory, IO, CPU resources) would be acceptable if the consolidation were to occur on a computer targeted for workload consolidation (the destination computer). In addition, the workload manager 105 program can be expanded to simulate the impact on the QOS of transferring the workload. If the workload manager 105 determines that the QOS in either of the consolidated workload simulations is not acceptable, i.e., the resulting utilization of resources (e.g., memory, IO, or CPU) was too low in the migration computer (indicating a decline of QOS in the migration workload) as a result of the simulated consolidation or transfer, the transfer of the workload and consolidation to the destination computer is avoided. Avoiding the consolidation saves not only the time and expense of errantly transferring the workload to the destination computer, but also the cost of transferring the workload back to the migration computer. In a similar fashion, the impact of a workload migration on the destination computer can be determined by running a balloon workload on the destination computer that simulates the additional load to be imposed by the migrating workload.

Referring again to FIG. 1, the computers 110-150 are in communication with each other by wired or wireless communication links 160. While the term computers is used throughout, the term is intended to be synonymous with central processing units (CPUs), workstations, servers, and the like, and to encompass any and all of the examples referring to computers discussed herein and shown in each of the figures.

FIG. 2 illustrates in more detail any one or all of the plurality of computers 110-150 as an example of an individual computer system 200 that can be employed to implement systems and methods described herein, such as based on computer executable instructions running on the computer system. The computer system 200 can be implemented on one or more general purpose networked computer systems, embedded computer systems, routers, switches, server devices, client devices, various intermediate devices/nodes and/or stand-alone computer systems. Additionally, the computer system 200 can be implemented as part of a network analyzer or associated design tool running computer executable instructions to perform methods and functions, as described herein.

The computer system 200 includes a processor 202 and a system memory 204. A system bus 206 couples various system components, including the system memory 204, to the processor 202. Dual microprocessors and other multi-processor architectures can also be utilized as the processor 202. The system bus 206 can be implemented as any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, and a local bus using any of a variety of bus architectures. The system memory 204 includes read only memory (ROM) 208 and random access memory (RAM) 210. A basic input/output system (BIOS) 212 can reside in the ROM 208, generally containing the basic routines that help to transfer information between elements within the computer system 200, such as during a reset or power-up.

The computer system 200 can include a hard disk drive 214, a magnetic disk drive 216, e.g., to read from or write to a removable disk 218, and an optical disk drive 220, e.g., for reading a CD-ROM or DVD disk 222 or to read from or write to other optical media. The hard disk drive 214, magnetic disk drive 216, and optical disk drive 220 are connected to the system bus 206 by a hard disk drive interface 224, a magnetic disk drive interface 226, and an optical drive interface 228, respectively. The drives and their associated computer-readable media provide nonvolatile storage of data, data structures, and computer-executable instructions for the computer system 200. Although the description of computer-readable media above refers to a hard disk, a removable magnetic disk and a CD, other types of media which are readable by a computer may also be used. For example, computer executable instructions for implementing systems and methods described herein may also be stored in magnetic cassettes, flash memory cards, digital video disks and the like. A number of program modules may also be stored in one or more of the drives as well as in the RAM 210, including an operating system 230, one or more application programs 232, other program modules 234, and program data 236.

A user may enter commands and information into the computer system 200 through a user input device 240, such as a keyboard or a pointing device (e.g., a mouse). Other input devices may include a microphone, a joystick, a game pad, a scanner, a touch screen, or the like. These and other input devices are often connected to the processor 202 through a corresponding interface or bus 242 that is coupled to the system bus 206. Such input devices can alternatively be connected to the system bus 206 by other interfaces, such as a parallel port, a serial port or a universal serial bus (USB). One or more output device(s) 244, such as a visual display device or printer, can also be connected to the system bus 206 via an interface or adapter 246.

The computer system 200 may operate in a networked environment using logical connections 248 (representative of the communication links 160 in FIG. 1) to one or more remote computers 250 (representative of any of the plurality of computers 110-150 in FIG. 1). The remote computer 250 may be a workstation, a computer system, a router, a peer device or other common network node, and typically includes many or all of the elements described relative to the computer system 200. The logical connections 248 can include a local area network (LAN) and a wide area network (WAN).

When used in a LAN networking environment, the computer system 200 can be connected to a local network through a network interface 252. When used in a WAN networking environment, the computer system 200 can include a modem (not shown), or can be connected to a communications server via a LAN. In a networked environment, application programs 232 and program data 236 depicted relative to the computer system 200, or portions thereof, may be stored in memory 254 of the remote computer 250.

Each of the computer systems 200 in the plurality of computers 110-150 of the datacenter 100 may be running different or similar operating systems and/or applications. Further, each of the computers 110-150 may include a workload varying in size. For example, computers 110 and 150 include Workload A and Workload E, respectively, acting as web servers, computer 130 includes Workload C acting as a print server, and computer 120 includes Workload B acting as an application server.

Once the migration and destination computers are targeted, the workload evaluation management employs the workload manager 105 program to inflate a balloon workload 170 on the migration computer. The inflated balloon workload 170 simulates a consolidated workload that includes a workload originating on a target migration computer with a simulated workload modeling a workload running on a target destination computer, without having to transfer any of the workloads from the migration or destination computers. FIG. 3 illustrates such an example that comprises a consolidated workload that includes both workload A (that originated in the targeted migrating computer 110) and a balloon workload 170 simulating the addition of workload C currently on the destination computer 130. Concurrently, the destination computer 130 continues to process workload C without interruption from the simulation performed by the balloon workload 170 on migration computer 110.

The balloon workload 170 mimics the load that workload C, already running on the destination computer 130, would impose if consolidation occurred, using parameters sent by the workload manager 105 for workload C. The balloon workload 170 uses the resources (e.g., CPU, memory, IO) in the migration computer 110 to mimic the resource consumption to be used in the destination computer 130 by workload C. The parameters sent to the balloon workload 170 by the workload manager 105 account for the differences in utilization, speeds, and bandwidth of the migration and destination computers 110, 130, respectively. In addition, the balloon workload 170 includes parameters that are established by the workload manager 105 to accept the destination computer's demand rate for important resource classes.
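By way of illustration only, the following minimal sketch shows one way parameters for a balloon workload might be derived so that they account for differences between the two computers. The names `MachineProfile`, `Demand`, and `balloon_parameters`, the choice of units, and the clamping of each value to the migration computer's capacity are hypothetical assumptions made for this example and are not taken from the described embodiments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MachineProfile:
    """Hypothetical capacity description of one computer."""
    cpu_ghz_total: float   # aggregate CPU capacity, assumed unit
    mem_gb: float          # installed memory
    io_mb_per_s: float     # sustainable IO bandwidth


@dataclass(frozen=True)
class Demand:
    """Hypothetical resource demand of one workload."""
    cpu_fraction: float    # fraction of the host's CPU capacity in use
    mem_gb: float
    io_mb_per_s: float


def balloon_parameters(demand_on_dest: Demand,
                       dest: MachineProfile,
                       mig: MachineProfile) -> Demand:
    """Express the destination workload's demand in terms of the
    migration computer's capacity (assumed linear scaling)."""
    # Convert the CPU share on the destination into an absolute amount,
    # then into a share of the (possibly slower) migration computer.
    cpu_abs = demand_on_dest.cpu_fraction * dest.cpu_ghz_total
    return Demand(
        cpu_fraction=min(1.0, cpu_abs / mig.cpu_ghz_total),
        # Assumption: the balloon never asks for more than the host has.
        mem_gb=min(demand_on_dest.mem_gb, mig.mem_gb),
        io_mb_per_s=min(demand_on_dest.io_mb_per_s, mig.io_mb_per_s),
    )
```

In this sketch, a workload using a quarter of a destination computer with twice the CPU capacity would be mimicked by a balloon consuming half of the migration computer's CPU. A real workload manager could use any other mapping.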

The balloon workload 170 originates on each of the computers 110-150 in the datacenter 100, where it remains deflated until instructed to inflate, i.e., until an execution command is initiated by the workload manager 105. Alternatively, the workload manager 105, a remote computer outside the datacenter 100, a computer located within the datacenter, or a computer operator may selectively install or transmit the balloon workload 170 onto the targeted migration computer, where it remains deflated until instructed to inflate by the workload manager. When deflated, the balloon workload utilizes minimal resources.

FIG. 4 illustrates a datacenter 300 employing workload evaluation management led by a workload manager 302. The workload manager includes a workload management program 303 and a management database 304. Once the balloon workload 170 is inflated in the migration computer 310, the workload manager 302 evaluates the migration computer's performance. In particular, the workload manager 302 through its program 303 may look at information internal to the balloon workload 170, the migration workload executable rates, input/output (IO) rates, central processing unit (CPU) execution rates, and the like. Further, the workload manager 302 can simulate and evaluate the impact on the QOS of the actual transference of the workload from the migration computer to the destination computer.

The evaluations of such resources are used by the workload management program 303 to determine whether the workload in the migration computer should be transferred to the destination computer. In the illustrated example of FIG. 4, the evaluations of the resources by the workload management program 303 are performed on migration computer 310 during the balloon workload 170 inflation period. The balloon workload 170, during the inflation period, simulates the running of preexisting workload A with workload C found on the target destination computer 330.

If the workload evaluation performed on the migration computer 310 appears to be satisfactory to the workload manager 302, i.e., the resources continue to operate above a threshold that provides an acceptable QOS, the balloon workload 170 deflates instantaneously and workload A is transferred from the migration computer 310 to the destination computer 330 for workload consolidation. The workload transfer may be achieved by many different means, including conventional means such as physically transferring the workload from one computer to another or more modern means such as a migration of guest operating systems from one hypervisor (also referred to as a virtual machine monitor) to another.

If the workload evaluation performed on the migration computer 310 appears to be unsatisfactory to the workload manager 302, i.e., the resources are found to operate below the threshold, providing a less than acceptable QOS, the balloon workload 170 deflates instantaneously and the transfer of the workload A from the migration computer 310 to the destination computer 330 is avoided. By deflating quickly, the interval of time during which the migration workload's QOS is perturbed by the experiment is minimized.
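The threshold test described in the two preceding paragraphs might be sketched as follows. This is an illustration under stated assumptions, not the described implementation: the function name `consolidation_decision`, the use of a ratio between a rate measured before inflation and a rate measured during inflation, the default value of 0.9, and the treatment of a value exactly at the threshold as acceptable are all hypothetical.

```python
def consolidation_decision(baseline_rate: float,
                           inflated_rate: float,
                           threshold_ratio: float = 0.9) -> str:
    """Return "transfer" or "avoid" for one balloon experiment.

    baseline_rate  : migration workload rate before inflation (assumed metric)
    inflated_rate  : the same rate while the balloon workload is inflated
    threshold_ratio: hypothetical minimum fraction of the baseline that
                     still counts as an acceptable QOS
    """
    if baseline_rate <= 0:
        raise ValueError("baseline_rate must be positive")
    ratio = inflated_rate / baseline_rate
    # Assumption: meeting the threshold exactly is treated as acceptable.
    return "transfer" if ratio >= threshold_ratio else "avoid"
```

For instance, with the assumed default, a drop from 1000 to 950 operations per second would yield "transfer", while a drop to 700 would yield "avoid". In either case the balloon workload would be deflated before any transfer takes place.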

The workload manager 302, in its evaluation of the migration computer's resources with the balloon workload 170 inflated, may also consult evaluation data 305, which includes historical information 306 of workloads on the computers 310-350 in the datacenter 300 and previous transfer history relating to workload compatibility 307. The workload compatibility 307 is based on historical consolidations compiled automatically or manually by exogenous input. The evaluation data 305 further includes real-time update capability 308, which provides real-time information on balloon workload 170 simulations that are occurring in the datacenter 300 to the historical information 306 and/or workload compatibility 307 databases. Similarly, the evaluation data 305 also includes input capabilities 309 from consolidated computers, providing information relating to efficiencies after consolidation. The information from the input capabilities 309 is used in the historical information 306 and/or workload compatibility 307 databases.
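As a rough sketch of how workload compatibility information of this kind could be accumulated and consulted, the following example keeps an in-memory record of past outcomes per pair of workload types. The class name `CompatibilityHistory`, its methods, and the use of workload type labels as keys are hypothetical; the description does not specify a storage format or schema for the management database.

```python
from collections import defaultdict
from typing import Optional


class CompatibilityHistory:
    """Hypothetical in-memory stand-in for workload compatibility data."""

    def __init__(self) -> None:
        # Maps an unordered pair of workload types to past outcomes.
        self._outcomes = defaultdict(list)

    def record(self, workload_a: str, workload_b: str,
               acceptable: bool) -> None:
        """Store the outcome of one simulation or real consolidation."""
        key = frozenset((workload_a, workload_b))
        self._outcomes[key].append(bool(acceptable))

    def success_rate(self, workload_a: str,
                     workload_b: str) -> Optional[float]:
        """Fraction of recorded outcomes that were acceptable, or None
        when nothing has been recorded for this pair."""
        outcomes = self._outcomes.get(frozenset((workload_a, workload_b)))
        if not outcomes:
            return None
        return sum(outcomes) / len(outcomes)
```

A workload manager could, for example, consult `success_rate("web", "print")` when ranking candidate pairs, and call `record` after each balloon simulation to keep the data current.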

FIG. 5 illustrates a flow diagram of a workload evaluation management methodology 400 for determining whether a workload operating in a migrating computer is a viable candidate for consolidation with a different workload operating on a destination computer. The workload evaluation methodology 400 can be generated from computer readable media, such as software or firmware residing in the computer, discrete circuitry such as an application specific integrated circuit (ASIC), or any combination thereof.

The methodology starts at 410, wherein a hypervisor, the workload manager 302, or a human initiates a search for migration and destination computer candidates within the datacenter. At 420, a search for a migration and destination computer is commenced. The search performed at 420 could utilize the evaluation data 305 found in the management database 304 in evaluating potential migration and destination candidates. At 430, migration and destination computers are identified. At 440, a balloon workload is inflated on the migration computer. The balloon workload inflation simulates a consolidation workload, combining the existing workload on the migration computer with a simulated workload found on the destination computer. As such, a new environment is constructed on the migration computer. At 450, an evaluation is made as to whether the throughput declined or the resources consumed during the balloon workload inflation increased. Stated another way, an evaluation is made as to whether the QOS threshold was maintained during the balloon simulation. Should direct measurement of the migration workload QOS not be possible, the impact on QOS can be inferred from the resource consumption of the migration workload. If resource consumption drops, then it is likely that the QOS (throughput or response time) has been adversely impacted. If the result of the evaluation is (NO), that is, the resource consumption of the migration workload decreased, a decision is made to avoid the transfer of the workload residing in the migration computer to the destination computer for workload consolidation. At 452, the balloon workload is deflated and a search for a new migration or destination computer occurs. Alternatively, the workload evaluation management methodology 400 may terminate at this point. At 454, the results of the evaluation at 450 are recorded in the management database 304.

If the result of the evaluation is (YES), that is, the resources increased or were maintained above a threshold, a decision is made at 460 to transfer the workload residing in the migration computer to the destination computer for workload consolidation. An alternative methodology may include yet another step, evaluating the consolidation and QOS at a time period after the consolidation step 460.
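The flow of methodology 400 can be approximated by the minimal sketch below. It is offered only as an illustration: the function `run_methodology`, the injected `measure_qos` callable, the comparison against a baseline taken before inflation, and the logging of both accepted and rejected candidates are assumptions of this example. In the description above, step 454 is stated only for the (NO) branch, and the manner in which QOS is measured is left open.

```python
from typing import Callable, Dict, Iterable, List, Optional, Tuple

Pair = Tuple[str, str]  # (migration computer, destination computer)


def run_methodology(candidates: Iterable[Pair],
                    measure_qos: Callable[[str, bool], float],
                    threshold_ratio: float,
                    log: List[Dict]) -> Optional[Pair]:
    """Walk candidate pairs and return the first pair whose balloon
    experiment keeps QOS at or above the threshold, else None.

    measure_qos(host, inflated) is a hypothetical hook returning the
    migration workload's QOS with the balloon deflated or inflated.
    """
    for migration, destination in candidates:        # steps 420 and 430
        baseline = measure_qos(migration, False)     # balloon deflated
        inflated = measure_qos(migration, True)      # step 440
        accepted = inflated >= threshold_ratio * baseline  # step 450
        # The balloon is assumed to be deflated here (step 452).
        log.append({                                 # step 454 (assumed for
            "migration": migration,                  # both outcomes)
            "destination": destination,
            "baseline": baseline,
            "inflated": inflated,
            "accepted": accepted,
        })
        if accepted:
            return migration, destination            # step 460: transfer
    return None                                      # no viable candidate
```

The caller would perform the actual transfer for the returned pair. Returning `None` corresponds to terminating the methodology without a consolidation.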

FIG. 6 illustrates a flow diagram of a workload evaluation management methodology 500. The methodology 500 is for evaluating workload consolidation on a computer located in a datacenter. At 510, a balloon workload is inflated on a first computer, simulating a consolidation workload of a workload originating on the first computer and a workload originating on a second computer. At 520, an evaluation is made relating to the resources used on the first computer during the inflating. At 530, the workload originating on either the first or the second computer is transferred to the other of the first or second computer if the evaluated resources remained above a threshold.

What have been described above are examples of the present invention. It is, of course, not possible to describe every conceivable combination of components or methodologies for purposes of describing the present invention, but one of ordinary skill in the art will recognize that many further combinations and permutations of the present invention are possible. Accordingly, the present invention is intended to embrace all such alterations, modifications and variations that fall within the spirit and scope of the appended claims.

1. A method for evaluating workload consolidation on a computer located in a datacenter comprising: inflating a balloon workload on a first computer that simulates a consolidation workload of a workload originating on the first computer and a workload originating on a second computer; evaluating the quality of service on the first computer during the inflating; and transferring the workload originating on either the first or the second computer to the other of the first or second computer if the quality of service is determined to remain above a threshold.
2. The method of claim 1, wherein the evaluating the quality of service is achieved by observing the resource consumption on the first computer.
3. The method of claim 2, wherein transferring the workload originating on either the first or the second computer to the other of the first or second computer if the evaluating of the quality of service had been determined to remain above a threshold comprises transferring the workload originating on the first computer to the second computer.
4. The method of claim 1, further comprising supplying the balloon workload with parameters that account for differences between the first and the second computer.
5. The method of claim 1, further comprising transmitting a balloon workload to the first computer.
6. The method of claim 1, further comprising installing a balloon workload on at least one computer located in the datacenter, the installed balloon workload remaining dormant until receiving instructions by a workload manager to inflate.
7. The method of claim 2, wherein the observing the resource consumption on the first computer during the inflating comprises assessing the first computer's executable rates, input/output rates, and/or central processing execution rates during inflation.
8. The method of claim 2, wherein the observing the resource consumption on the first computer during the inflating comprises assessing the resource consumption as a result of transferring the workload originating on either the first computer or second computer to the other of the first or second computer.
9. The method of claim 2, wherein the observing the resource consumption on the first computer further comprises comparing the resource consumption to historical resource information found in a database.
10. The method of claim 2, wherein the observing the resource consumption on the first computer further comprises comparing the resource consumption to workload compatibility information found in a database.
11. The method of claim 1, further comprising updating a database with real-time information relating to resource utilization in existing consolidation computers located in the datacenter.
12. The method of claim 2, wherein the observing the resource consumption on the first computer is performed by a source outside of the datacenter.
13. A system for evaluating workload consolidation on a computer located in a datacenter, the system comprising: a balloon workload located on a first computer that creates a consolidation workload that simulates workloads found on the first computer and workloads found on a second computer; a workload manager that evaluates one of quality of service and resources used on the first computer during the simulation of the consolidated workload; and a predetermined threshold in the workload manager that allows for the transference of the workloads found in either of the first computer or the second computer to the other of the first or the second computer if the one of quality of service and resources evaluated by the workload manager during the simulation of the consolidated workload remains above the predetermined threshold.
14. The system of claim 13, wherein the workload manager transfers the workloads found in the first computer from the first computer to the second computer for workload consolidation if it determines that the one of quality of service and resources used during the simulation of the consolidated workload remains above a predetermined threshold.
15. The system of claim 13, further comprising parameters provided to the balloon workload by the workload manager that account for differences between the first and the second computer during the workload consolidation simulation.
16. The system of claim 13, further comprising a database having historical resource information used by the workload manager during the evaluation of the one of quality of service and resources used on the first computer during the simulation of the consolidated workload.
17. The system of claim 13, further comprising a database having workload compatibility information used by the workload manager during the evaluation of the one of quality of service and resources used on the first computer during the simulation of the consolidated workload.
18. The system of claim 13, further comprising an input to a workload manager database that allows for real-time information relating resource utilization in existing prior workload consolidation computers within the datacenter.
19. The system of claim 13, wherein the workload manager that evaluates the one of quality of service and resources used on the first computer during the simulation of the consolidated workload is located outside of the datacenter.
20. A computer readable medium having computer executable instructions for performing a method comprising: inflating a balloon workload on a first computer that simulates a consolidation workload of a workload originating on the first computer and a workload originating on a second computer at the time of the inflating; evaluating the quality of service on the first computer during the inflating; and transferring the workload originating on the first computer for consolidation with the workload originating on the second computer if the evaluating of the quality of service remains above a threshold during the inflation of the balloon workload.
21. The computer readable medium having computer executable instructions for performing the method of claim 20, wherein the evaluating the quality of service is achieved by observing the resource consumption on the first computer.
22. The computer readable medium having computer executable instructions for performing the method of claim 21, wherein the evaluating the quality of service on the first computer during the inflating includes the impact on the quality of service as a result of transferring the workload originating on the first computer from the first computer to the second computer.